Priors and Posteriors

Elizabeth King
Kevin Middleton

Priors

Priors let us model with knowledge included:

  • Almost no knowledge
    • Carp either pass or do not pass
  • A little knowledge
    • We guess that the proportion is probably less than 50%
  • A lot of knowledge
    • Previous studies show that <25% of carp pass

Distributions of proportions (and probabilities)

  • Bounded by 0 and 1
  • Continuous (any proportion is possible)

The Beta distribution fits these requirements

  • Two parameters
    • a, \(\alpha\), shape1 in dbeta()
    • b, \(\beta\), shape2 in dbeta()
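
As a quick check of those requirements, the Beta density is a proper distribution on [0, 1]: it integrates to 1 for any valid pair of shape parameters. A small sketch, using an arbitrary Beta(2, 5):

```r
# Sanity check: the Beta density integrates to 1 over [0, 1].
# Beta(2, 5) here is an arbitrary example.
area <- integrate(dbeta, lower = 0, upper = 1,
                  shape1 = 2, shape2 = 5)$value
area
```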

Beta distribution: a = 1, b = 1

  • dbeta(x, shape1, shape2) is the density of the Beta distribution for a vector x and the two shape parameters
  • Create an evenly spaced sequence of P probabilities from 0 to 1
  • Calculate the density for each P, given shape1 and shape2
ggplot(data = tibble(P = seq(0, 1, length.out = 200),
                     Density = dbeta(P, shape1 = 1, shape2 = 1)),
       aes(P, Density)) +
  geom_line()

Beta distribution: a = 1, b = 1

Beta distribution: a = 2, b = 2

ggplot(data = tibble(P = seq(0, 1, length.out = 200),
                     Density = dbeta(P, shape1 = 2, shape2 = 2)),
       aes(P, Density)) +
  geom_line()

Beta distribution: a = 2, b = 2

Interactive Beta distribution

Beta distribution: a = 0.5, b = 0.5
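
This plot can be reproduced with the same pattern as the earlier slides. Because the Beta(0.5, 0.5) density is infinite at exactly 0 and 1, the sequence below trims the endpoints slightly (a sketch):

```r
library(tidyverse)

# U-shaped Beta(0.5, 0.5); endpoints trimmed because the density
# is infinite at exactly 0 and 1
beta_05 <- tibble(P = seq(0.001, 0.999, length.out = 200),
                  Density = dbeta(P, shape1 = 0.5, shape2 = 0.5))
ggplot(beta_05, aes(P, Density)) +
  geom_line()
```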

Beta distribution for carp observations

  • Start with a Beta(1, 1) prior
    • All we know is that carp either Pass or Do Not Pass

Begin observing carp

  Pass FishID       Date
1    0   5145 2017-07-12
2    0   5259 2017-07-12
3    0   5275 2017-07-12
4    0   5335 2017-07-12
5    0   5345 2017-07-12
6    0   5395 2017-07-12
  • If Pass is 1: add 1 to a
  • If Pass is 0: add 1 to b
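
For the six observations shown above (all 0), this updating rule can be sketched with sums rather than a loop:

```r
# Beta(1, 1) prior updated with six non-passing carp:
# each 1 adds 1 to a, each 0 adds 1 to b
a <- 1
b <- 1
Pass <- c(0, 0, 0, 0, 0, 0)
a <- a + sum(Pass == 1)
b <- b + sum(Pass == 0)
c(a = a, b = b)  # a = 1, b = 7
```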

Carp 1: No pass

Carp 2: No pass

Carp 3: No pass

Carp 7: Pass

Carp 8: No pass

Carp 9: Pass

All the carp

# Track the Beta parameters as each carp is observed.
# Row 1 holds the Beta(1, 1) prior; row ii holds the
# parameters after observing carp ii - 1.
ab <- tibble(a = rep(1, 72),
             b = rep(1, 72),
             Pass = NA)

for (ii in 2:nrow(ab)) {
  ab$Pass[ii] <- Carp$Pass[ii - 1]
  if (Carp$Pass[ii - 1] == 0) {
    # No pass: add 1 to b
    ab$a[ii] <- ab$a[ii - 1]
    ab$b[ii] <- ab$b[ii - 1] + 1
  } else {
    # Pass: add 1 to a
    ab$a[ii] <- ab$a[ii - 1] + 1
    ab$b[ii] <- ab$b[ii - 1]
  }
}

print(ab, n = 20)

All the carp

# A tibble: 72 × 3
       a     b  Pass
   <dbl> <dbl> <int>
 1     1     1    NA
 2     1     2     0
 3     1     3     0
 4     1     4     0
 5     1     5     0
 6     1     6     0
 7     1     7     0
 8     2     7     1
 9     2     8     0
10     3     8     1
11     3     9     0
12     3    10     0
13     3    11     0
14     3    12     0
15     3    13     0
16     3    14     0
17     3    15     0
18     3    16     0
19     3    17     0
20     4    17     1
# … with 52 more rows

Plotting the first 16 carp

Final distribution of probability

Shuffle the order of observations

set.seed(42364)
Carp <- Carp[sample(seq_len(nrow(Carp))), ]
Carp
   Pass FishID       Date
8     0   5465 2017-07-12
66    0   495D 2018-07-17
64    0   48D5 2018-07-17
35    1   52E9 2017-08-09
65    0   495B 2018-07-17
50    0   52AF 2017-09-20
70    0   54F5 2018-07-17
23    0   5285 2017-08-09
67    0   54AF 2018-07-17
57    1   4895 2018-07-17
2     0   5259 2017-07-12
6     0   5395 2017-07-12
48    0   4ED5 2017-09-20
37    0   544D 2017-08-09
26    0   4D95 2017-08-09
71    0   550B 2018-07-17
14    0   50D5 2017-07-12
63    0   5509 2018-07-17
34    0   529D 2017-08-09
13    0   4EB5 2017-07-12
39    0   5159 2017-09-20
47    0   4E95 2017-09-20
22    0   5265 2017-08-09
38    0   5095 2017-09-20
17    0   53AD 2017-07-12
12    0   4D75 2017-07-12
19    1   5115 2017-08-09
56    0   542D 2017-09-20
20    0   5125 2017-08-09
46    0   5459 2017-09-20
58    1   4915 2018-07-17
42    0   5359 2017-09-20
44    0   5415 2017-09-20
33    0   526B 2017-08-09
5     0   5345 2017-07-12
3     0   5275 2017-07-12
7     1   5429 2017-07-12
10    0   4D65 2017-07-12
16    0   52D7 2017-07-12
36    0   53B5 2017-08-09
41    0   5249 2017-09-20
51    0   52BB 2017-09-20
61    0   4959 2018-07-17
25    0   5315 2017-08-09
69    0   54EB 2018-07-17
68    1   54DD 2018-07-17
1     0   5145 2017-07-12
30    0   51A5 2017-08-09
60    1   4929 2018-07-17
55    0   52EB 2017-09-20
45    0   5435 2017-09-20
52    0   52D9 2017-09-20
28    0   4EA5 2017-08-09
9     1   5469 2017-07-12
4     0   5335 2017-07-12
53    0   52DB 2017-09-20
24    0   5291 2017-08-09
15    0   526D 2017-07-12
32    1   522B 2017-08-09
59    0   4925 2018-07-17
27    0   4DD5 2017-08-09
54    0   52E5 2017-09-20
18    0   53D5 2017-07-12
49    0   525B 2017-09-20
62    0   5489 2018-07-17
29    0   512D 2017-08-09
43    0   5369 2017-09-20
11    0   4D6B 2017-07-12
21    0   5149 2017-08-09
40    0   5225 2017-09-20
31    0   51B5 2017-08-09

Repeat and plot

Comparing two posteriors

The order of the data does not matter
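
This follows because the Beta-binomial update is commutative: the posterior depends only on the totals, not the sequence. A minimal sketch with the counts from this dataset (9 passes, 63 non-passes):

```r
# With a Beta(1, 1) prior, the order of the 72 observations is
# irrelevant: the posterior is Beta(1 + passes, 1 + non-passes)
n_pass <- 9
n_no_pass <- 63
a_post <- 1 + n_pass     # 10
b_post <- 1 + n_no_pass  # 64
c(a = a_post, b = b_post)
```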

Informative priors

  • Almost no knowledge
    • Carp either pass or do not pass
    • a = 1; b = 1
  • A little knowledge
    • We guess that the proportion is probably less than 50%
    • a = 5; b = 15
  • A lot of knowledge
    • Previous studies show that <25% of carp pass
    • a = 5; b = 40
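
The three priors can be overlaid with dbeta(), following the same plotting pattern as the earlier slides (a sketch):

```r
library(tidyverse)

# Densities of the three candidate priors on a common grid
P <- seq(0, 1, length.out = 200)
priors <- bind_rows(
  tibble(P = P, Density = dbeta(P, 1, 1),  Prior = "Beta(1, 1)"),
  tibble(P = P, Density = dbeta(P, 5, 15), Prior = "Beta(5, 15)"),
  tibble(P = P, Density = dbeta(P, 5, 40), Prior = "Beta(5, 40)"))
ggplot(priors, aes(P, Density, color = Prior)) +
  geom_line()
```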

Informative priors

Posteriors from informative priors

Different priors = Different posteriors
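
Because the Beta prior is conjugate, each posterior is simply the prior plus the observed counts (9 passes, 63 non-passes here); the posterior means differ accordingly. A sketch:

```r
library(tibble)

# Posterior parameters and means under each prior, given
# 9 passes and 63 non-passes
posteriors <- tibble(
  Prior = c("Beta(1, 1)", "Beta(5, 15)", "Beta(5, 40)"),
  a = c(1, 5, 5) + 9,
  b = c(1, 15, 40) + 63)
posteriors$Mean <- posteriors$a / (posteriors$a + posteriors$b)
posteriors
```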

Frequentist confidence intervals (see Morey et al. 2016)

  • 95% CI for a proportion of 9 / 72 is 0.07 to 0.22 (via Wilson’s method)
  • “If we repeated this experiment over and over, 95% of the intervals calculated this way would contain the true proportion.”

Not:

  • 95% certain that the proportion is between 0.07 and 0.22
  • 95% probability that the proportion is between 0.07 and 0.22
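
The Wilson interval above can be reproduced with prop.test(); with the continuity correction turned off, its confidence interval is the Wilson score interval:

```r
# 95% Wilson score interval for 9 passes out of 72
ci <- prop.test(x = 9, n = 72, correct = FALSE)$conf.int
round(ci, 2)  # approximately 0.07 and 0.22
```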

Bayesian highest density interval

  • 95% highest density interval (HDI) for the proportion (given a Beta(1, 1) prior and the data) is 0.06 to 0.22

“There is a 95% probability that the true proportion falls between 0.06 and 0.22.”
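
One simple way to compute the HDI is a grid search: slide a 95% interval along the posterior and keep the narrowest one. A sketch, assuming the Beta(1, 1) prior updated with 9 passes out of 72 (i.e., a Beta(10, 64) posterior):

```r
# Narrowest interval containing 95% of the Beta(10, 64) posterior
a <- 1 + 9
b <- 1 + 63
lower_tail <- seq(0, 0.05, length.out = 10001)
widths <- qbeta(lower_tail + 0.95, a, b) - qbeta(lower_tail, a, b)
best <- which.min(widths)
hdi <- c(qbeta(lower_tail[best], a, b),
         qbeta(lower_tail[best] + 0.95, a, b))
round(hdi, 2)
```

Unlike the equal-tailed interval from qbeta(c(0.025, 0.975), a, b), the HDI is the narrowest 95% interval; for a skewed posterior the two differ slightly.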

Advantages of Bayesian inference 1

  • Randomness or noise is a property of the data, not of the sampling process
    • Is there really one single “true” population body mass for the rabbits in a field?
  • Include prior knowledge in the model
  • Posteriors represent the relative plausibility of estimates (given a model, data, and priors)
  • Posteriors can be interpreted as probabilities

Advantages of Bayesian inference 2

  • Regularizing priors can make complex models feasible
    • Multilevel/mixed models
    • Generalized linear models
    • Nonlinear models

References

Morey, Richard D., Rink Hoekstra, Jeffrey N. Rouder, Michael D. Lee, and Eric-Jan Wagenmakers. 2016. “The Fallacy of Placing Confidence in Confidence Intervals.” Psychon. Bull. Rev. 23 (1): 103–23. https://doi.org/10.3758/s13423-015-0947-8.